Abstract

The Gallager bound is well known in the area of channel coding. However, most discussions of it focus on its
applications to memoryless channels. We show in this paper that the bounds obtained by Gallager's method are very tight even for
general sources and channels as defined in information-spectrum theory. Our method is mainly based on estimates
of the error exponents in those bounds, and by these estimates we prove the direct part of the Slepian-Wolf theorem and of the channel
coding theorem for general sources and channels.

I. Introduction 


In his paper [1] in 1965, Gallager developed a simple inequality technique to derive the coding theorem for memoryless 
channels without resorting to the law of large numbers. Its central idea may be summarized as the following two basic 
inequalities. 

$$\mathbf{1}\{x' \ge x\} \le \left(\frac{x'}{x}\right)^{s}, \quad x > 0,\ x' \ge 0,\ s \ge 0,$$

$$\min\{x, 1\} \le x^{\rho}, \quad x \ge 0,\ 0 \le \rho \le 1.$$
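The two elementary inequalities above can be spot-checked numerically. The following is a minimal sketch, assuming the ranges x > 0, x' >= 0, s >= 0 and 0 <= rho <= 1; the function names and the test grid are illustrative and not part of the original paper.

```python
import itertools

def indicator_bound_holds(x_prime, x, s):
    """Check 1{x' >= x} <= (x'/x)**s for x > 0, x' >= 0, s >= 0."""
    indicator = 1.0 if x_prime >= x else 0.0
    return indicator <= (x_prime / x) ** s + 1e-12

def min_bound_holds(x, rho):
    """Check min{x, 1} <= x**rho for x >= 0 and 0 <= rho <= 1."""
    return min(x, 1.0) <= x ** rho + 1e-12

# Illustrative grid (an assumption): a few positive values and exponents in [0, 1].
xs = [0.01, 0.5, 1.0, 2.0, 50.0]
exponents = [0.0, 0.3, 1.0]
all_ok = all(
    indicator_bound_holds(xp, x, s) and min_bound_holds(x, s)
    for xp, x, s in itertools.product([0.0] + xs, xs, exponents)
)
```

A grid check of this kind is only a sanity test, not a proof; both inequalities follow directly from the monotonicity of the power function.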

Although this method is also applicable to channels with memory, Gallager and other researchers mainly concentrated on
its applications to memoryless channels. To the best of our knowledge, no one has so far investigated the extension of Gallager's
method to general channels.

Recently, inspired by Gallager's method and its development [1]-[3], we derived a similar upper bound on the average
probability of maximum a posteriori (MAP) decoding error of Slepian-Wolf codes, and we proved the direct part of the
Slepian-Wolf theorem for general sources [4], [5]. Compared with the result obtained by information-spectrum methods [6], our proof
is slightly weaker, since we assume that the alphabets of the correlated sources are finite, but it does suggest that Gallager's method
may be applicable to general sources and channels defined in the framework of information-spectrum theory [7]. Following
the idea in [4], we show in this paper that Gallager's method is indeed applicable to general sources and channels.


II. Definitions and Notations

A general source in information-spectrum theory [7] is defined as an infinite sequence

$$\mathbf{X} = \left\{ X^n = \left(X_1^{(n)}, X_2^{(n)}, \ldots, X_n^{(n)}\right) \right\}_{n=1}^{\infty}$$

of n-dimensional random variables $X^n$, where each component random variable $X_i^{(n)}$ ($1 \le i \le n$) takes values in the alphabet
$\mathcal{X}$ (finite or countably infinite). Analogously, we can define the general correlated sources $\mathbf{XY}$ as an infinite sequence

$$\mathbf{XY} = \left\{ X^nY^n = \left(X_1^{(n)}Y_1^{(n)}, \ldots, X_n^{(n)}Y_n^{(n)}\right) \right\}_{n=1}^{\infty}$$

of n-dimensional random variables $X^nY^n$, where each component random variable $X_i^{(n)}Y_i^{(n)} = (X_i^{(n)}, Y_i^{(n)})$ ($1 \le i \le n$)
takes values in the product alphabet $\mathcal{X} \times \mathcal{Y}$. We denote the sample spaces of the n-dimensional random
variables $X^nY^n$, $X^n$ and $Y^n$ by $\mathcal{X}^n \times \mathcal{Y}^n$, $\mathcal{X}^n$ and $\mathcal{Y}^n$, and their sample sequences by $x^ny^n$, $x^n$ and $y^n$, respectively.
Let $W^n = W^n(\cdot|\cdot)$ be an arbitrary conditional probability distribution satisfying

$$\sum_{y^n \in \mathcal{Y}^n} W^n(y^n|x^n) = 1, \quad \forall x^n \in \mathcal{X}^n$$

for each $n = 1, 2, \ldots$. We call the sequence $\mathbf{W} = \{W^n\}_{n=1}^{\infty}$ a general channel.

For convenience, we also use the notations $P_X(x)$ and $P_{X|Y}(x|y)$ in place of $\Pr\{X = x\}$ and $\Pr\{X = x | Y = y\}$,
respectively.



III. Main Results 

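Anyone familiar with Gallager's method knows that almost all the results in [1] are obtained by analyzing the
properties of the function $E_0^{(n)}(\rho, \mathbf{X})$ ($0 \le \rho \le 1$) defined by

$$E_0^{(n)}(\rho, \mathbf{X}) = -\frac{1}{n} \ln \sum_{y^n \in \mathcal{Y}^n} \left( \sum_{x^n \in \mathcal{X}^n} P_{X^n}(x^n) W^n(y^n|x^n)^{\frac{1}{1+\rho}} \right)^{1+\rho}. \tag{1}$$

With the assumption that the input and channel are memoryless and stationary, the function $E_0^{(n)}(\rho, \mathbf{X})$ can be reduced to

$$E_0^{(n)}(\rho, \mathbf{X}) = -\ln \sum_{y \in \mathcal{Y}} \left( \sum_{x \in \mathcal{X}} P_X(x) W(y|x)^{\frac{1}{1+\rho}} \right)^{1+\rho}.$$

It then does not depend on n, and Gallager analyzed it by analytic methods. However, when we consider a general input source
$\mathbf{X}$ and a general channel $\mathbf{W}$, the behavior of the function $E_0^{(n)}(\rho, \mathbf{X})$ becomes complicated, since it may change with n and may converge
to zero for any $\rho \in (0, 1]$, and hence Gallager's analytic methods are no longer valid. To address this, we adopt a different method
based on estimates of the function $E_0^{(n)}(\rho, \mathbf{X})$, and we prove the following theorem.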
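The claim that the exponent function does not depend on n in the stationary memoryless case can be illustrated by brute force. The sketch below assumes a binary symmetric channel with crossover probability 0.1 and a uniform i.i.d. input (both chosen only for illustration); `e0_block` is a hypothetical helper that evaluates definition (1) by enumerating all block pairs.

```python
import itertools
import math

def e0_block(rho, p_x, w, n):
    """Evaluate -(1/n) ln sum_y (sum_x P(x) W(y|x)^(1/(1+rho)))^(1+rho) over blocks of length n.

    p_x is an i.i.d. input distribution and w[x][y] a memoryless channel matrix.
    """
    xs, ys = range(len(p_x)), range(len(w[0]))
    total = 0.0
    for yn in itertools.product(ys, repeat=n):
        inner = 0.0
        for xn in itertools.product(xs, repeat=n):
            p = math.prod(p_x[x] for x in xn)
            wn = math.prod(w[x][y] for x, y in zip(xn, yn))
            inner += p * wn ** (1.0 / (1.0 + rho))
        total += inner ** (1.0 + rho)
    return -math.log(total) / n

# Assumed example: BSC(0.1) with uniform input, rho = 0.5.
p = 0.1
p_x = [0.5, 0.5]
w = [[1 - p, p], [p, 1 - p]]
rho = 0.5
values = [e0_block(rho, p_x, w, n) for n in (1, 2, 3)]

# Closed form of the single-letter function for this particular channel.
t = 1.0 / (1.0 + rho)
closed_form = rho * math.log(2) - (1 + rho) * math.log(p**t + (1 - p) ** t)
```

The enumeration grows exponentially in n, so it is only practical for very short blocks; its purpose is to make the product structure visible, not to compute exponents in general.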

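Theorem 1: Let $\mathbf{X}$ be a general source and let $\mathbf{W}$ be a general channel. Then for any $0 < \delta < \underline{I}(\mathbf{X};\mathbf{Y})$, there exists a
sequence $\rho_n$ defined by

$$\rho_n = \min\left\{ \frac{-\frac{1}{2} \ln \epsilon_n(\delta)}{n(\underline{I}(\mathbf{X};\mathbf{Y}) - \delta)}, 1 \right\}, \tag{2}$$

such that for any $n \ge 1$,

$$E_0^{(n)}(\rho_n, \mathbf{X}) \ge \rho_n(\underline{I}(\mathbf{X};\mathbf{Y}) - \delta) - \frac{3 \ln 2}{n}, \tag{3}$$

where $\mathbf{Y}$ is the output of the channel and $\underline{I}(\mathbf{X};\mathbf{Y})$ is the spectral inf-mutual information rate defined by

$$\underline{I}(\mathbf{X};\mathbf{Y}) = \sup\left\{ \beta : \lim_{n \to \infty} \Pr\left\{ \frac{1}{n} \ln \frac{W^n(Y^n|X^n)}{P_{Y^n}(Y^n)} < \beta \right\} = 0 \right\}, \tag{4}$$

and

$$\epsilon_n(\delta) = \Pr\left\{ \frac{1}{n} \ln \frac{W^n(Y^n|X^n)}{P_{Y^n}(Y^n)} < \underline{I}(\mathbf{X};\mathbf{Y}) - \delta \right\}. \tag{5}$$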

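Proof: Without loss of generality, we assume that $P_{Y^n}(y^n) > 0$ for all $y^n \in \mathcal{Y}^n$, and we define

$$A(y^n, \delta) = \left\{ x^n \in \mathcal{X}^n : \frac{1}{n} \ln \frac{W^n(y^n|x^n)}{P_{Y^n}(y^n)} \ge \underline{I}(\mathbf{X};\mathbf{Y}) - \delta \right\}, \tag{6}$$

hence it follows from the definitions (4) and (5) that

$$\sum_{y^n \in \mathcal{Y}^n} P_{X^nY^n}(A(y^n, \delta)^c, y^n) = \epsilon_n(\delta), \quad \lim_{n \to \infty} \epsilon_n(\delta) = 0.$$

We further define

$$B(\delta) = \left\{ y^n \in \mathcal{Y}^n : P_{X^n|Y^n}(A(y^n, \delta)^c | y^n) \le \epsilon_n(\delta)^{\frac{1}{2}} \right\}, \tag{7}$$

and we have

$$P_{Y^n}(B(\delta)^c) = \sum_{y^n \in B(\delta)^c} P_{Y^n}(y^n) \le \sum_{y^n \in B(\delta)^c} \frac{P_{X^nY^n}(A(y^n, \delta)^c, y^n)}{\epsilon_n(\delta)^{\frac{1}{2}}} \le \epsilon_n(\delta)^{\frac{1}{2}}. \tag{8}$$

Then we have

$$\begin{aligned}
\exp\{-n E_0^{(n)}(\rho_n, \mathbf{X})\}
&= \sum_{y^n \in \mathcal{Y}^n} \left( \sum_{x^n \in \mathcal{X}^n} P_{X^n}(x^n)^{\frac{\rho_n}{1+\rho_n}} P_{X^nY^n}(x^ny^n)^{\frac{1}{1+\rho_n}} \right)^{1+\rho_n} \\
&\overset{(a)}{\le} \sum_{y^n \in B(\delta)^c} P_{Y^n}(y^n) + \sum_{y^n \in B(\delta)} P_{Y^n}(y^n) \left[ \sum_{x^n \in A(y^n, \delta)} P_{X^n|Y^n}(x^n|y^n)^{\frac{1}{1+\rho_n}} P_{X^n}(x^n)^{\frac{\rho_n}{1+\rho_n}} + \sum_{x^n \in A(y^n, \delta)^c} P_{X^n|Y^n}(x^n|y^n)^{\frac{1}{1+\rho_n}} P_{X^n}(x^n)^{\frac{\rho_n}{1+\rho_n}} \right]^{1+\rho_n} \\
&\overset{(b)}{\le} \epsilon_n(\delta)^{\frac{1}{2}} + \sum_{y^n \in B(\delta)} P_{Y^n}(y^n) \left( P_{X^n|Y^n}(A(y^n, \delta)^c | y^n)^{\frac{1}{1+\rho_n}} + e^{-\frac{n\rho_n}{1+\rho_n}(\underline{I}(\mathbf{X};\mathbf{Y}) - \delta)} \right)^{1+\rho_n} \\
&\overset{(c)}{\le} \sum_{y^n \in B(\delta)} P_{Y^n}(y^n) \left( \epsilon_n(\delta)^{\frac{1}{2(1+\rho_n)}} + e^{-\frac{n\rho_n}{1+\rho_n}(\underline{I}(\mathbf{X};\mathbf{Y}) - \delta)} \right)^{1+\rho_n} + \epsilon_n(\delta)^{\frac{1}{2}} \\
&\overset{(d)}{\le} e^{-n\rho_n(\underline{I}(\mathbf{X};\mathbf{Y}) - \delta) + 2 \ln 2} + \epsilon_n(\delta)^{\frac{1}{2}} \\
&\overset{(e)}{\le} e^{-n\rho_n(\underline{I}(\mathbf{X};\mathbf{Y}) - \delta) + 3 \ln 2},
\end{aligned}$$

where (a) follows from Hölder's inequality, (b) follows from (6), (8) and Hölder's inequality, (c) follows from (7),
and (d) and (e) follow from (2). This concludes (3). □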
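As a sanity check of the bound in Theorem 1, it can be evaluated in a case where every quantity is computable exactly. The sketch below is an illustration under stated assumptions, not part of the original argument: it takes a binary symmetric channel with uniform i.i.d. input, so the spectral inf-mutual information rate equals the ordinary mutual information, the exponent function equals its single-letter form, and eps_n(delta) is a binomial tail. The function name and parameters are hypothetical.

```python
import math

def bsc_theorem1_check(p, n, delta):
    """Return (rho_n, E_0(rho_n), lower bound) for a BSC(p) with uniform i.i.d. input."""
    # Mutual information in nats: I = ln 2 - h(p).
    info = math.log(2) + p * math.log(p) + (1 - p) * math.log(1 - p)
    assert 0 < delta < info
    # Per-symbol information density: ln(2(1-p)) without a flip, ln(2p) with a flip.
    good, bad = math.log(2 * (1 - p)), math.log(2 * p)
    # eps_n(delta) = Pr{ (1/n) * (sum of densities) < I - delta }, as an exact binomial sum.
    eps = sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n + 1)
        if (k * bad + (n - k) * good) / n < info - delta
    )
    # rho_n = min{ -(1/2) ln eps / (n (I - delta)), 1 }; eps == 0 makes the ratio infinite.
    rho = 1.0 if eps == 0 else min(-0.5 * math.log(eps) / (n * (info - delta)), 1.0)
    # Single-letter Gallager function for the BSC with uniform input.
    t = 1.0 / (1.0 + rho)
    e0 = rho * math.log(2) - (1 + rho) * math.log(p**t + (1 - p) ** t)
    return rho, e0, rho * (info - delta) - 3 * math.log(2) / n

results = [bsc_theorem1_check(0.1, n, 0.1) for n in (1, 10, 50, 200)]
```

In this memoryless example the additive term 3 ln 2 / n is loose for small n; the point of the theorem is the asymptotic behavior, where n times rho_n grows without bound.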
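By Theorem 1, we can easily prove the direct part of the coding theorem for general channels just by showing the following
fact.

Corollary 1: Let $\mathbf{W}$ be a general channel. If the coding rate

$$R < C(\mathbf{W}) = \sup_{\mathbf{X}} \underline{I}(\mathbf{X};\mathbf{Y}), \tag{9}$$

then the function

$$E^{(n)}(R) = \max_{X^n} \max_{0 \le \rho \le 1} \left\{ E_0^{(n)}(\rho, \mathbf{X}) - \rho R \right\} \tag{10}$$

satisfies $nE^{(n)}(R) \to \infty$ as $n \to \infty$.

Proof: For any rate $R < C(\mathbf{W})$, there exists a general source $\mathbf{X}$ such that $R < \underline{I}(\mathbf{X};\mathbf{Y})$. Then by Theorem 1 we have,
for any $0 < \delta < \underline{I}(\mathbf{X};\mathbf{Y})$,

$$\begin{aligned}
nE^{(n)}(R) &\ge n\left(E_0^{(n)}(\rho_n, \mathbf{X}) - \rho_n R\right) \\
&\ge n\rho_n(\underline{I}(\mathbf{X};\mathbf{Y}) - \delta - R) - 3 \ln 2,
\end{aligned} \tag{11}$$

where $\rho_n$ is defined by (2). Because $n\rho_n \to \infty$ as $n \to \infty$, the lower bound (11) goes to infinity as $n \to \infty$ for sufficiently
small $\delta$, and this concludes the corollary. □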
Remark 1: An intermediate result in the proof of Theorem 1 in [1] is that there exist codes with rate R such that
the average probability of maximum likelihood (ML) decoding error is at most

$$\exp\{-nE^{(n)}(R)\},$$

and hence the direct part of the coding theorem for general channels is proved.

In [4], we obtained a similar upper bound on the average probability of maximum a posteriori (MAP) decoding error of
Slepian-Wolf codes for general sources. There are three terms in the upper bound, and a typical form of the error exponents
of these terms may be formulated as follows:

$$J^{(n)}(R) = \max_{0 \le \rho \le 1} \left\{ \rho R - J_0^{(n)}(\rho) \right\}, \tag{12}$$

where

$$J_0^{(n)}(\rho) = \frac{1}{n} \ln \sum_{y^n \in \mathcal{Y}^n} \left( \sum_{x^n \in \mathcal{X}^n} P_{X^nY^n}(x^ny^n)^{\frac{1}{1+\rho}} \right)^{1+\rho}. \tag{13}$$

By Corollary 2, to be proved in the sequel, we proved the direct part of the Slepian-Wolf theorem. Later, we found that Gallager
had already obtained similar bounds for single sources in [8, Exercise 5.16]. Of course, he only discussed the properties of
the bound for stationary memoryless sources. For general sources with finite alphabets, we have the following theorem on the
property of (13).

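Theorem 2: Let $\mathbf{XY}$ be a pair of general correlated sources satisfying $|\mathcal{X}| < \infty$. Then for any $\delta > 0$, there exists a sequence
$\rho_n$ defined by

$$\rho_n = \min\left\{ \frac{-\frac{1}{2} \ln \epsilon_n(\delta)}{n(\ln|\mathcal{X}| - \overline{H}(\mathbf{X}|\mathbf{Y}) - \delta)}, 1 \right\} \tag{14}$$

for $\overline{H}(\mathbf{X}|\mathbf{Y}) < \ln|\mathcal{X}| - \delta$ (and $\rho_n = 1$ for $\overline{H}(\mathbf{X}|\mathbf{Y}) \ge \ln|\mathcal{X}| - \delta$) such that for any $n \ge 1$,

$$J_0^{(n)}(\rho_n) \le \rho_n(\overline{H}(\mathbf{X}|\mathbf{Y}) + \delta) + \frac{3 \ln 2}{n}, \tag{15}$$

where $\overline{H}(\mathbf{X}|\mathbf{Y})$ is the spectral conditional sup-entropy rate defined by

$$\overline{H}(\mathbf{X}|\mathbf{Y}) = \inf\left\{ \alpha : \lim_{n \to \infty} \Pr\left\{ \frac{1}{n} \ln \frac{1}{P_{X^n|Y^n}(X^n|Y^n)} > \alpha \right\} = 0 \right\}, \tag{16}$$

and

$$\epsilon_n(\delta) = \Pr\left\{ \frac{1}{n} \ln \frac{1}{P_{X^n|Y^n}(X^n|Y^n)} > \overline{H}(\mathbf{X}|\mathbf{Y}) + \delta \right\}. \tag{17}$$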
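Proof: Without loss of generality, we assume that $P_{Y^n}(y^n) > 0$ for all $y^n \in \mathcal{Y}^n$, and we define

$$A(y^n, \delta) = \left\{ x^n \in \mathcal{X}^n : \frac{1}{n} \ln \frac{1}{P_{X^n|Y^n}(x^n|y^n)} \le \overline{H}(\mathbf{X}|\mathbf{Y}) + \delta \right\}, \tag{18}$$

hence it follows from the definitions (16) and (17) that

$$\sum_{y^n \in \mathcal{Y}^n} P_{X^nY^n}(A(y^n, \delta)^c, y^n) = \epsilon_n(\delta), \quad \lim_{n \to \infty} \epsilon_n(\delta) = 0.$$

Analogous to (7) and (8), we further define

$$B(\delta) = \left\{ y^n \in \mathcal{Y}^n : P_{X^n|Y^n}(A(y^n, \delta)^c | y^n) \le \epsilon_n(\delta)^{\frac{1}{2}} \right\}, \tag{19}$$

and we have

$$P_{Y^n}(B(\delta)^c) \le \epsilon_n(\delta)^{\frac{1}{2}}. \tag{20}$$

Then we have

$$\begin{aligned}
\exp\{n J_0^{(n)}(\rho_n)\}
&= \sum_{y^n \in B(\delta)} P_{Y^n}(y^n) \left( \sum_{x^n \in A(y^n, \delta)} P_{X^n|Y^n}(x^n|y^n)^{\frac{1}{1+\rho_n}} + \sum_{x^n \in A(y^n, \delta)^c} P_{X^n|Y^n}(x^n|y^n)^{\frac{1}{1+\rho_n}} \right)^{1+\rho_n} + \sum_{y^n \in B(\delta)^c} \left( \sum_{x^n \in \mathcal{X}^n} P_{X^nY^n}(x^ny^n)^{\frac{1}{1+\rho_n}} \right)^{1+\rho_n} \\
&\overset{(a)}{\le} \sum_{y^n \in B(\delta)} P_{Y^n}(y^n) \left( e^{\frac{n\rho_n(\overline{H}(\mathbf{X}|\mathbf{Y}) + \delta)}{1+\rho_n}} + |\mathcal{X}|^{\frac{n\rho_n}{1+\rho_n}} P_{X^n|Y^n}(A(y^n, \delta)^c | y^n)^{\frac{1}{1+\rho_n}} \right)^{1+\rho_n} + \sum_{y^n \in B(\delta)^c} |\mathcal{X}|^{n\rho_n} P_{Y^n}(y^n) \\
&\overset{(b)}{\le} |\mathcal{X}|^{n\rho_n} \epsilon_n(\delta)^{\frac{1}{2}} + \sum_{y^n \in B(\delta)} P_{Y^n}(y^n) \left( e^{\frac{n\rho_n(\overline{H}(\mathbf{X}|\mathbf{Y}) + \delta)}{1+\rho_n}} + |\mathcal{X}|^{\frac{n\rho_n}{1+\rho_n}} \epsilon_n(\delta)^{\frac{1}{2(1+\rho_n)}} \right)^{1+\rho_n} \\
&\overset{(c)}{\le} e^{n\rho_n(\overline{H}(\mathbf{X}|\mathbf{Y}) + \delta) + 2 \ln 2} + |\mathcal{X}|^{n\rho_n} \epsilon_n(\delta)^{\frac{1}{2}} \\
&\overset{(d)}{\le} e^{n\rho_n(\overline{H}(\mathbf{X}|\mathbf{Y}) + \delta) + 3 \ln 2},
\end{aligned}$$

where (a) follows from (18) and Jensen's inequality (or Hölder's inequality), (b) follows from (19) and (20), and (c) and
(d) follow from (14). This concludes (15). □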
In the same way, we have the following corollary.

Corollary 2: Let $\mathbf{XY}$ be a pair of general correlated sources satisfying $|\mathcal{X}| < \infty$. If the coding rate

$$R > \overline{H}(\mathbf{X}|\mathbf{Y}), \tag{21}$$

then the function (12) satisfies $nJ^{(n)}(R) \to \infty$ as $n \to \infty$.
Proof: It follows from Theorem 2 that for any $\delta > 0$,

$$\begin{aligned}
nJ^{(n)}(R) &\ge n\left(\rho_n R - J_0^{(n)}(\rho_n)\right) \\
&\ge n\rho_n(R - \overline{H}(\mathbf{X}|\mathbf{Y}) - \delta) - 3 \ln 2,
\end{aligned} \tag{22}$$

where $\rho_n$ is defined by (14). Because $n\rho_n \to \infty$ as $n \to \infty$, the lower bound (22) goes to infinity as $n \to \infty$ for sufficiently
small $\delta$, and this concludes the corollary. □

IV. Conclusions and Discussions 

An important fact implied above is that the upper bounds obtained by Gallager's method are very tight even for general
sources and channels, and we think that stronger versions of Theorems 1 and 2 may be obtained by more sophisticated methods.

The authors want to emphasize the possibility that there may exist some nontrivial and interesting properties of these
error exponents. For example, consider the derivative of the function (13). We have

$$\frac{\partial J_0^{(n)}(\rho)}{\partial \rho} = \frac{1}{n} H(X^n(\rho)|Y^n(\rho)),$$

where the distribution of $X^n(\rho)Y^n(\rho)$ is defined by

$$P_{X^n(\rho)Y^n(\rho)}(x^ny^n) = \frac{P_{X^nY^n}(x^ny^n)^{\frac{1}{1+\rho}} \left( \sum_{\hat{x}^n \in \mathcal{X}^n} P_{X^nY^n}(\hat{x}^ny^n)^{\frac{1}{1+\rho}} \right)^{\rho}}{\sum_{\hat{y}^n \in \mathcal{Y}^n} \left( \sum_{\hat{x}^n \in \mathcal{X}^n} P_{X^nY^n}(\hat{x}^n\hat{y}^n)^{\frac{1}{1+\rho}} \right)^{1+\rho}}.$$

Clearly, $X^n(0)Y^n(0) = X^nY^n$, that is, $X^n(0)Y^n(0)$ and $X^nY^n$ have the same distribution. Hence for $\frac{1}{n}H(X^n|Y^n) \le R \le
\frac{1}{n}H(X^n(1)|Y^n(1))$, we have

$$J^{(n)}(R) = \rho_0 R - J_0^{(n)}(\rho_0),$$

where $\rho_0$ satisfies $\frac{1}{n}H(X^n(\rho_0)|Y^n(\rho_0)) = R$. (Here, we omit the technical details.) Obviously, when n is fixed, the distribution of the random
variable $X^n(\rho_0)Y^n(\rho_0)$ is a function of R, and hence traces a simple curve in the space of n-dimensional probability distributions.
However, when the dependence on n is considered, the problem becomes complicated due to the complexity of the general correlated sources
$\mathbf{XY}$. Therefore, further investigation of these error exponents is needed.
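The derivative property discussed in this section, namely that the derivative with respect to rho of ln sum_y (sum_x P(x,y)^(1/(1+rho)))^(1+rho) equals the conditional entropy of the tilted distribution, can be checked numerically for a single block. The sketch below is illustrative only: the 2x2 joint distribution is an arbitrary assumption, the factor 1/n is omitted (n = 1), and the helper names are hypothetical.

```python
import math

def j0(rho, p_xy):
    """ln sum_y (sum_x P(x,y)^(1/(1+rho)))^(1+rho); p_xy[x][y] is a joint distribution."""
    t = 1.0 / (1.0 + rho)
    n_y = len(p_xy[0])
    return math.log(sum(sum(row[y] ** t for row in p_xy) ** (1 + rho) for y in range(n_y)))

def tilted_conditional_entropy(rho, p_xy):
    """H(X(rho)|Y(rho)) in nats for the tilted distribution built from p_xy."""
    t = 1.0 / (1.0 + rho)
    n_y = len(p_xy[0])
    s = [sum(row[y] ** t for row in p_xy) for y in range(n_y)]
    z = sum(sy ** (1 + rho) for sy in s)
    h = 0.0
    for y in range(n_y):
        q_y = s[y] ** (1 + rho) / z          # tilted marginal of Y
        for row in p_xy:
            q_x_given_y = row[y] ** t / s[y]  # tilted conditional of X given Y
            if q_x_given_y > 0:
                h -= q_y * q_x_given_y * math.log(q_x_given_y)
    return h

# Assumed joint distribution on a 2x2 alphabet (rows: x, columns: y).
p_xy = [[0.40, 0.10], [0.05, 0.45]]
step = 1e-6
checks = []
for rho in (0.0, 0.3, 1.0):
    numeric = (j0(rho + step, p_xy) - j0(rho - step, p_xy)) / (2 * step)
    checks.append((numeric, tilted_conditional_entropy(rho, p_xy)))
```

At rho = 0 the tilted distribution coincides with the original one, so the derivative there is the ordinary conditional entropy H(X|Y).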